Anatomy of Annotation Schemes: Mapping to GrAF
Abstract
In this paper, we apply the annotation scheme design methodology defined in (Bunt, 2010) and demonstrate its use for generating a mapping from an existing annotation scheme to a representation in GrAF format. The most important features of this methodology are (1) the distinction between the abstract and concrete syntax of an annotation language; (2) the specification of a formal semantics for the abstract syntax; and (3) the formalization of the relation between abstract and concrete syntax, which guarantees that any concrete syntax inherits the semantics of the abstract syntax, and thus guarantees meaning-preserving mappings between representation formats. By way of illustration, we apply this mapping strategy to annotations from ISO-TimeML, PropBank, and FrameNet.
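The abstract/concrete distinction can be illustrated with a minimal sketch. It is not the paper's formalization and does not reproduce the actual GrAF XML schema: the class names `Entity` and `Link`, the functions `to_graph` and `from_graph`, and the dictionary layout are all hypothetical, chosen only to show an abstract structure being encoded into a graph of nodes, annotations, and edges and then decoded back without loss. The labels `EVENT`, `TIMEX3`, and `TLINK` are borrowed from ISO-TimeML for flavour; the offsets are invented.

```python
from dataclasses import dataclass

# Abstract syntax (hypothetical, simplified): an entity structure pairs a
# span of primary data with features; a link structure relates two entities.
@dataclass(frozen=True)
class Entity:
    ident: str
    span: tuple       # (start, end) character offsets, invented for the example
    features: tuple   # sorted (name, value) pairs

@dataclass(frozen=True)
class Link:
    relation: str
    source: str
    target: str

def to_graph(entities, links):
    """Encode abstract structures as a graph-shaped concrete representation.

    The layout (nodes, annotations, edges) is an assumption made for this
    sketch, loosely inspired by graph-based formats; it is not GrAF itself.
    """
    nodes = [{"id": e.ident, "anchors": list(e.span)} for e in entities]
    annotations = [{"ref": e.ident, "fs": dict(e.features)} for e in entities]
    edges = [{"from": l.source, "to": l.target, "label": l.relation} for l in links]
    return {"nodes": nodes, "annotations": annotations, "edges": edges}

def from_graph(graph):
    """Decode the graph back into the abstract structures."""
    fs = {a["ref"]: a["fs"] for a in graph["annotations"]}
    entities = [
        Entity(n["id"], tuple(n["anchors"]), tuple(sorted(fs[n["id"]].items())))
        for n in graph["nodes"]
    ]
    links = [Link(e["label"], e["from"], e["to"]) for e in graph["edges"]]
    return entities, links

# Illustrative data: one event, one time expression, one temporal link.
entities = [
    Entity("e1", (5, 9), (("type", "EVENT"),)),
    Entity("t1", (13, 19), (("type", "TIMEX3"),)),
]
links = [Link("TLINK", "e1", "t1")]

graph = to_graph(entities, links)
# Decoding the encoding returns the original abstract structures.
assert from_graph(graph) == (entities, links)
```

Because the semantics would be attached to `Entity` and `Link` rather than to any particular serialization, a second concrete format with its own encode/decode pair could be converted to and from this one by passing through the abstract structures.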
Similar resources
Towards Interoperability for the Penn Discourse Treebank
The recent proliferation of diverse types of linguistically annotated schemes coded in different representation formats has led to efforts to make annotations interoperable, so that they can be effectively used towards empirical NL research. We have rendered the Penn Discourse Treebank (PDTB) annotation scheme in an abstract syntax following a formal generalized annotation scheme methodology, t...
An annotation scheme for Persian based on Autonomous Phrases Theory and Universal Dependencies
A treebank is a corpus with linguistic annotations above the level of the parts of speech. During the first half of the present decade, three treebanks have been developed for Persian either originally or subsequently based on dependency grammar: Persian Treebank (PerTreeBank), Persian Syntactic Dependency Treebank, and Uppsala Persian Dependency Treebank (UPDT). The syntactic analysis of a sen...
Importing MASC into the ANNIS linguistic database: A case study of mapping GrAF
This paper describes the importation of Manually Annotated Sub-Corpus (MASC) data and annotations into the linguistic database ANNIS, which allows users to visualize and query linguistically-annotated corpora. We outline the process of mapping MASC’s GrAF representation to ANNIS’s internal format relANNIS and demonstrate how the system provides access to multiple annotation layers in the corpus...
Bridging the Gaps: Interoperability for GrAF, GATE, and UIMA
This paper explores interoperability for data represented using the Graph Annotation Framework (GrAF) (Ide and Suderman, 2007) and the data formats utilized by two general-purpose annotation systems: the General Architecture for Text Engineering (GATE) (Cunningham, 2002) and the Unstructured Information Management Architecture (UIMA). GrAF is intended to serve as a “pivot” to enable interoperab...
The Linguistic Annotation Framework: a standard for annotation interchange and merging
This paper overviews the International Standards Organization Linguistic Annotation Framework (ISO LAF) developed in ISO TC37 SC4. We describe the XML serialization of ISO LAF, the Graph Annotation Format (GrAF) and discuss the rationale behind the various decisions that were made in determining the standard. We describe the structure of the GrAF headers in detail and provide multiple examples ...